Simple Denoising Diffusion Language Models
Zhu, Huaisheng, Chen, Zhengyu, Zhou, Shijie, Xie, Zhihui, Yuan, Yige, Guo, Zhimeng, Xu, Siyuan, Zhang, Hangfan, Honavar, Vasant, Xiao, Teng
Diffusion models have recently been extended to language generation through Masked Diffusion Language Models (MDLMs), which achieve performance competitive with strong autoregressive models. However, MDLMs tend to degrade in the few-step regime and cannot directly adopt existing few-step distillation methods designed for continuous diffusion models, as they lack the intrinsic property of mapping from noise to data. Recent Uniform-state Diffusion Models (USDMs), initialized from a uniform prior, alleviate some limitations but still suffer from complex loss formulations that hinder scalability. In this work, we propose a simplified denoising-based loss for USDMs that optimizes only noise-replaced tokens, stabilizing training and matching ELBO-level performance. Furthermore, by framing denoising as self-supervised learning, we introduce a simple modification to our denoising loss with contrastive-inspired negative gradients, which is practical and yields additional improvements in generation quality.
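The simplified objective can be sketched in a few lines: a uniform-state forward process replaces tokens at random, and cross-entropy is computed only at the replaced positions, with an optional contrastive-inspired negative term. This is a minimal stdlib-Python sketch under assumed interfaces (the function names, the list-of-logits representation, and the `beta` weight are all hypothetical), not the authors' implementation.

```python
import math
import random

def uniform_noise(tokens, vocab_size, t, rng):
    """Uniform-state forward process: replace each token with a uniform-random
    token with probability t, recording which positions were replaced."""
    noised, replaced = [], []
    for tok in tokens:
        if rng.random() < t:
            noised.append(rng.randrange(vocab_size))
            replaced.append(True)
        else:
            noised.append(tok)
            replaced.append(False)
    return noised, replaced

def denoising_loss(logits, targets, replaced):
    """Cross-entropy averaged over noise-replaced positions only."""
    losses = []
    for row, tgt, was_replaced in zip(logits, targets, replaced):
        if not was_replaced:
            continue
        m = max(row)  # stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        losses.append(log_z - row[tgt])
    return sum(losses) / max(len(losses), 1)

def contrastive_denoising_loss(logits, targets, replaced, neg_targets, beta=0.1):
    """Denoising loss plus a contrastive-inspired negative gradient that pushes
    probability mass away from sampled negative tokens (beta is hypothetical)."""
    return (denoising_loss(logits, targets, replaced)
            - beta * denoising_loss(logits, neg_targets, replaced))
```

Only the replaced positions contribute to the loss, so unnoised tokens generate no gradient, which is the stabilizing property the abstract highlights.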
Does Local News Stay Local?: Online Content Shifts in Sinclair-Acquired Stations
Wanner, Miriam, Hager, Sophia, Field, Anjalie
Local news stations are often considered to be reliable sources of non-politicized information, particularly on local concerns that residents care about. Because these stations are trusted news sources, viewers are particularly susceptible to the information they report. The Sinclair Broadcast Group is a broadcasting company that has acquired many local news stations in the last decade. We investigate the effects of local news stations being acquired by Sinclair: how does coverage change? We use computational methods to investigate changes in internet content put out by local news stations before and after being acquired by Sinclair, in comparison to national news outlets. We find clear evidence that acquired stations report more frequently on national news at the expense of local topics, and that their coverage of polarizing national topics increases.
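The before/after coverage comparison can be sketched with topic-share deltas, assuming a toy representation in which each article carries a list of topic labels (the `topics` field and function names are hypothetical, not the paper's pipeline):

```python
from collections import Counter

def topic_shares(articles):
    """Fraction of topic mentions accounted for by each topic."""
    counts = Counter(topic for a in articles for topic in a["topics"])
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def coverage_shift(before, after, topic):
    """Change in a topic's share of coverage from the pre- to post-acquisition period."""
    return topic_shares(after).get(topic, 0.0) - topic_shares(before).get(topic, 0.0)
```

A positive shift for national topics paired with a negative shift for local ones is the pattern the paper reports.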
Millions of Americans under dangerous freeze warning TODAY as temperatures plunge to 22 F
Millions of Americans are facing a dangerous freeze warning on Tuesday as temperatures drop below freezing across multiple states. Sub-freezing temperatures as low as 22 to 30 F are expected in parts of Wisconsin, Minnesota, North Dakota, South Dakota, Michigan, Colorado, Wyoming and Idaho. The National Weather Service (NWS) issued the warning for tonight into Wednesday morning, ending between 8 and 10am local time, depending on the state and county.
In-memory Training on Analog Devices with Limited Conductance States via Multi-tile Residual Learning
Li, Jindan, Wu, Zhaoxian, Liu, Gaowen, Gokmen, Tayfun, Chen, Tianyi
Analog in-memory computing (AIMC) accelerators enable efficient deep neural network computation directly within memory using resistive crossbar arrays, where model parameters are represented by the conductance states of memristive devices. However, effective in-memory training typically requires at least 8-bit conductance states to match digital baselines. Realizing such fine-grained states is costly and often requires complex noise mitigation techniques that increase circuit complexity and energy consumption. In practice, many promising memristive devices such as ReRAM offer only about 4-bit resolution due to fabrication constraints, and this limited update precision substantially degrades training accuracy. To enable on-chip training with these limited-state devices, this paper proposes a residual learning framework that sequentially learns on multiple crossbar tiles to compensate for the residual errors from low-precision weight updates. Our theoretical analysis shows that the optimality gap shrinks with the number of tiles and achieves a linear convergence rate. Experiments on standard image classification benchmarks demonstrate that our method consistently outperforms state-of-the-art in-memory analog training strategies under limited-state settings, while incurring only moderate hardware overhead as confirmed by our cost analysis.
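The residual idea can be illustrated with scalar weights: each tile stores a quantized approximation of the residual left by the previous tiles, and the effective weight is the sum across tiles. The per-tile range schedule below is an assumption for illustration, not the paper's scheme:

```python
def quantize(w, n_states, w_max=1.0):
    """Snap a weight to the nearest of n_states evenly spaced conductance
    levels in [-w_max, w_max] (e.g. n_states=16 models a 4-bit device)."""
    step = 2.0 * w_max / (n_states - 1)
    return max(-w_max, min(w_max, round(w / step) * step))

def multi_tile_fit(target, n_tiles, n_states, w_max=1.0):
    """Sequentially fit tiles: each tile quantizes the residual error left by
    the tiles before it, so the summed representation improves tile by tile."""
    tiles, residual, scale = [], target, w_max
    for _ in range(n_tiles):
        q = quantize(residual, n_states, scale)
        tiles.append(q)
        residual -= q
        # Shrink the next tile's range to one quantization step of this tile,
        # so later tiles resolve ever finer residuals (an assumed schedule).
        scale = 2.0 * scale / (n_states - 1)
    return tiles, residual
```

With 4-bit tiles, two tiles already shrink the representation error well below a single tile's quantization step, mirroring the paper's claim that the optimality gap shrinks with the number of tiles.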
Inferring Effects of Major Events through Discontinuity Forecasting of Population Anxiety
Mangalik, Siddharth, Deshpande, Ojas, Ganesan, Adithya V., Clouston, Sean A. P., Schwartz, H. Andrew
Estimating community-specific mental health effects of local events is vital for public health policy. While forecasting mental health scores alone offers limited insights into the impact of events on community well-being, quasi-experimental designs like the Longitudinal Regression Discontinuity Design (LRDD) from econometrics help researchers derive effects that are more likely to be causal from observational data. LRDDs aim to extrapolate the size of changes in an outcome (e.g. a discontinuity in running scores for anxiety) due to a time-specific event. Here, we propose adapting LRDDs beyond traditional forecasting into a statistical learning framework whereby future discontinuities (i.e. time-specific shifts) and changes in slope (i.e. linear trajectories) are estimated given a location's history of the score, dynamic covariates (other running assessments), and exogenous variables (static representations). Applying our framework to predict discontinuities in the anxiety of US counties from COVID-19 events, we found the task was difficult but more achievable as the sophistication of models was increased, with the best results coming from integrating exogenous and dynamic covariates. Our approach shows strong improvement ($r=+.46$ for discontinuity and $r = +.65$ for slope) over traditional static community representations. Discontinuity forecasting raises new possibilities for estimating the idiosyncratic effects of potential future or hypothetical events on specific communities.
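The two target quantities can be illustrated with a plain regression-discontinuity fit: fit separate lines before and after the event time and compare them at the cutoff. This is a textbook RDD sketch, not the paper's learning framework (which forecasts these quantities from history and covariates):

```python
def ols_line(ts, ys):
    """Closed-form simple least squares: returns (intercept, slope)."""
    n = len(ts)
    mt, my = sum(ts) / n, sum(ys) / n
    b = sum((t - mt) * (y - my) for t, y in zip(ts, ys)) / sum((t - mt) ** 2 for t in ts)
    return my - b * mt, b

def rdd_effects(ts, ys, t0):
    """Discontinuity (jump at t0) and slope change from pre/post line fits."""
    pre = [(t, y) for t, y in zip(ts, ys) if t < t0]
    post = [(t, y) for t, y in zip(ts, ys) if t >= t0]
    a1, b1 = ols_line(*zip(*pre))
    a2, b2 = ols_line(*zip(*post))
    discontinuity = (a2 + b2 * t0) - (a1 + b1 * t0)
    slope_change = b2 - b1
    return discontinuity, slope_change
```

The framework in the paper predicts these discontinuity and slope-change values for future events rather than estimating them retrospectively.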
Demonstrating Interoperable Channel State Feedback Compression with Machine Learning
Korpi, Dani, Wang, Rachel, Wang, Jerry, Ibrahim, Abdelrahman, Nuzman, Carl, Wang, Runxin, Mestav, Kursat Rasim, Zhang, Dustin, Saniee, Iraj, Winston, Shawn, Pavlovic, Gordana, Ding, Wei, Hillery, William J., Hao, Chenxi, Thirunagari, Ram, Chang, Jung, Kim, Jeehyun, Kozicki, Bartek, Samardzija, Dragan, Yoo, Taesang, Maeder, Andreas, Ji, Tingfang, Viswanathan, Harish
Neural network-based compression and decompression of channel state feedback has been one of the most widely studied applications of machine learning (ML) in wireless networks. Various simulation-based studies have shown that ML-based feedback compression can result in reduced overhead and more accurate channel information. However, to the best of our knowledge, there are no real-life proofs of concept demonstrating the benefits of ML-based channel feedback compression in a practical setting, where the user equipment (UE) and base station have no access to each other's ML models. In this paper, we present a novel approach for training interoperable compression and decompression ML models in a confidential manner, and demonstrate the accuracy of the ensuing models using prototype UEs and base stations. The performance of the ML-based channel feedback is measured both in terms of the accuracy of the reconstructed channel information and achieved downlink throughput gains when using the channel information for beamforming. The reported measurement results demonstrate that it is possible to develop an accurate ML-based channel feedback link without having to share ML models between device and network vendors. These results pave the way for a practical implementation of ML-based channel feedback in commercial 6G networks.
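Reconstruction accuracy for channel feedback is often summarized with the squared generalized cosine similarity (SGCS) between the true and reconstructed channel vectors; the abstract does not name its exact metric, so the formula below is an illustrative, commonly used choice:

```python
def sgcs(h_true, h_rec):
    """Squared generalized cosine similarity between complex channel vectors:
    1.0 means perfect reconstruction up to a complex scaling, 0.0 means orthogonal."""
    inner = sum(a.conjugate() * b for a, b in zip(h_true, h_rec))
    n1 = sum(abs(a) ** 2 for a in h_true) ** 0.5
    n2 = sum(abs(b) ** 2 for b in h_rec) ** 0.5
    return abs(inner) ** 2 / (n1 * n2) ** 2
```

Because SGCS is invariant to complex scaling, it captures beamforming-relevant accuracy rather than raw reconstruction error.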
A RAG-Based Multi-Agent LLM System for Natural Hazard Resilience and Adaptation
Xie, Yangxinyu, Jiang, Bowen, Mallick, Tanwi, Bergerson, Joshua David, Hutchison, John K., Verner, Duane R., Branham, Jordan, Alexander, M. Ross, Ross, Robert B., Feng, Yan, Levy, Leslie-Anne, Su, Weijie, Taylor, Camillo J.
Large language models (LLMs) are a transformational capability at the frontier of artificial intelligence and machine learning that can support decision-makers in addressing pressing societal challenges such as extreme natural hazard events. As generalized models, LLMs often struggle to provide context-specific information, particularly in areas requiring specialized knowledge. In this work, we propose a retrieval-augmented generation (RAG)-based multi-agent LLM system to support analysis and decision-making in the context of natural hazards and extreme weather events. As a proof of concept, we present WildfireGPT, a specialized system focused on wildfire hazards. The architecture employs a user-centered, multi-agent design to deliver tailored risk insights across diverse stakeholder groups. By integrating natural hazard and extreme weather projection data, observational datasets, and scientific literature through a RAG framework, the system ensures both the accuracy and contextual relevance of the information it provides. Evaluation across ten expert-led case studies demonstrates that WildfireGPT significantly outperforms existing LLM-based solutions for decision support.
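The retrieval step of a RAG pipeline can be sketched with a toy keyword-overlap retriever standing in for embedding-based search (all names here are hypothetical illustrations, not WildfireGPT's API):

```python
def retrieve(query, documents, k=2):
    """Rank documents by word overlap with the query and keep the top k
    (a crude stand-in for embedding-based similarity search)."""
    q = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, documents):
    """Assemble retrieved context and the question into a single LLM prompt."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Grounding the prompt in retrieved projection data and literature is what gives the system its contextual relevance; the multi-agent layer then routes such prompts per stakeholder need.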
Jacobian Sparse Autoencoders: Sparsify Computations, Not Just Activations
Farnik, Lucy, Lawson, Tim, Houghton, Conor, Aitchison, Laurence
Sparse autoencoders (SAEs) have been successfully used to discover sparse and human-interpretable representations of the latent activations of LLMs. However, we would ultimately like to understand the computations performed by LLMs and not just their representations. The extent to which SAEs can help us understand computations is unclear because they are not designed to "sparsify" computations in any sense, only latent activations. To solve this, we propose Jacobian SAEs (JSAEs), which yield not only sparsity in the input and output activations of a given model component but also sparsity in the computation (formally, the Jacobian) connecting them. With a naïve implementation, the Jacobians in LLMs would be computationally intractable due to their size. One key technical contribution is thus finding an efficient way of computing Jacobians in this setup. We find that JSAEs extract a relatively large degree of computational sparsity while preserving downstream LLM performance approximately as well as traditional SAEs. We also show that Jacobians are a reasonable proxy for computational sparsity because MLPs are approximately linear when rewritten in the JSAE basis. Lastly, we show that JSAEs achieve a greater degree of computational sparsity on pre-trained LLMs than on the equivalent randomized LLM. This shows that the sparsity of the computational graph appears to be a property that LLMs learn through training, and suggests that JSAEs might be more suitable for understanding learned transformer computations than standard SAEs.
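Computational sparsity in the sense used above can be illustrated on a toy function: compute the Jacobian (here by finite differences, not the paper's efficient method) and measure the fraction of near-zero entries:

```python
def jacobian(f, x, eps=1e-6):
    """Finite-difference Jacobian of f: R^n -> R^m (rows index outputs)."""
    fx = f(x)
    cols = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += eps
        cols.append([(fp - f0) / eps for fp, f0 in zip(f(xp), fx)])
    return [list(row) for row in zip(*cols)]  # transpose columns into m x n rows

def sparsity(J, tol=1e-4):
    """Fraction of near-zero Jacobian entries; higher means the mapping between
    input and output coordinates is more computationally sparse."""
    entries = [abs(v) for row in J for v in row]
    return sum(v < tol for v in entries) / len(entries)
```

A function whose outputs each depend on only one input has a diagonal Jacobian and thus high sparsity; JSAEs train the SAE bases so that the component's Jacobian looks like this.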
LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation
Ye, Xi, Yin, Fangcong, He, Yinghui, Zhang, Joie, Yen, Howard, Gao, Tianyu, Durrett, Greg, Chen, Danqi
Existing benchmarks for evaluating long-context language models (LCLMs) primarily focus on long-context recall, requiring models to produce short responses based on a few critical snippets while processing thousands of irrelevant tokens. We introduce LongProc (Long Procedural Generation), a new benchmark that requires both the integration of highly dispersed information and long-form generation. LongProc consists of six diverse procedural generation tasks, such as extracting structured information from HTML pages into a TSV format and executing complex search procedures to create travel plans. These tasks challenge LCLMs by testing their ability to follow detailed procedural instructions, synthesize and reason over dispersed information, and generate structured, long-form outputs (up to 8K tokens). Furthermore, as these tasks adhere to deterministic procedures and yield structured outputs, they enable reliable rule-based evaluation. We evaluate 17 LCLMs on LongProc across three difficulty levels, with maximum numbers of output tokens set at 500, 2K, and 8K. Notably, while all tested models claim a context window size above 32K tokens, open-weight models typically falter on 2K-token tasks, and closed-source models like GPT-4o show significant degradation on 8K-token tasks. Further analysis reveals that LCLMs struggle to maintain long-range coherence in long-form generations. These findings highlight critical limitations in current LCLMs and suggest substantial room for improvement. Data and code available at: https://princeton-pli.github.io/LongProc
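Because LongProc tasks adhere to deterministic procedures and yield structured outputs, scoring can be rule-based. A minimal sketch for TSV-style outputs follows; the exact-row-match metric is an illustrative stand-in for the benchmark's actual rules:

```python
def score_tsv(predicted, reference):
    """Exact row-match rate between predicted and reference TSV strings:
    each line is split on tabs and compared cell-by-cell against the reference."""
    pred_rows = [r.split("\t") for r in predicted.strip().splitlines()]
    ref_rows = [r.split("\t") for r in reference.strip().splitlines()]
    matches = sum(p == r for p, r in zip(pred_rows, ref_rows))
    return matches / max(len(ref_rows), 1)
```

Deterministic checks like this avoid the noise of model-based grading, which matters when comparing outputs thousands of tokens long.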
Retrieval, Reasoning, Re-ranking: A Context-Enriched Framework for Knowledge Graph Completion
Li, Muzhi, Yang, Cehao, Xu, Chengjin, Jiang, Xuhui, Qi, Yiyan, Guo, Jian, Leung, Ho-fung, King, Irwin
The Knowledge Graph Completion (KGC) task aims to infer the missing entity from an incomplete triple. Existing embedding-based methods rely solely on triples in the KG, which makes them vulnerable to specious relation patterns and long-tail entities. On the other hand, text-based methods struggle with the semantic gap between KG triples and natural language. Apart from triples, entity contexts (e.g., labels, descriptions, aliases) also play a significant role in augmenting KGs. To address these limitations, we propose KGR3, a context-enriched framework for KGC. KGR3 is composed of three modules. Firstly, the Retrieval module gathers supporting triples from the KG, collects plausible candidate answers from a base embedding model, and retrieves context for each related entity. Then, the Reasoning module employs a large language model to generate potential answers for each query triple. Finally, the Re-ranking module combines candidate answers from the two modules mentioned above, and fine-tunes an LLM to provide the best answer. Extensive experiments on widely used datasets demonstrate that KGR3 consistently improves various KGC methods. Specifically, the best variant of KGR3 achieves absolute Hits@1 improvements of 12.3% and 5.6% on the FB15k237 and WN18RR datasets.
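The three-module flow can be sketched as a pipeline over pluggable callables; the function signatures are hypothetical stand-ins for the Retrieval, Reasoning, and Re-ranking modules described above:

```python
def kgc_pipeline(query, retrieve_fn, reason_fn, rerank_fn):
    """Retrieval gathers candidate answers plus supporting context; Reasoning
    proposes additional answers from that context; Re-ranking merges both
    candidate pools and selects the final answer."""
    candidates, context = retrieve_fn(query)
    candidates = candidates + reason_fn(query, context)
    return rerank_fn(query, candidates)
```

Keeping the stages as separate callables mirrors the paper's design: the base embedding model, the LLM reasoner, and the fine-tuned re-ranker can each be swapped independently.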